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Abstract 

In this paper we take up Bayesian inference in general multivariate stable distributions. We exploit the repre¬ 
sentation of Matsui and Takemura (2009) for univariate projections, and the representation of the distributions 
in terms of their spectral measure. We present efficient MCMC schemes to perform the computations when the 
spectral measure is approximated discretely or, as we propose, by a normal distribution. Appropriate latent 
variables are introduced to implement MCMC. In relation to the discrete approximation, we propose efficient 
computational schemes based on the characteristic function. 
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1 Introduction 


Univariate stable distributions have been thoroughly studied in econometrics, statistics and finance over the past 
few decades (Samorodnitsky and Taqqu, 1994). Their empirical application is still hampered by the fact that their 
density is not available in analytic form, despite advances in Bayesian computation using MCMC. Buckle (1995) 
and Tsionas (1999) provided Gibbs sampling schemes for general and symmetric stable distributions, respectively. 
The problem is that the conditional posterior distributions of certain latent variables are cumbersome to work with 
and require careful tuning. The analogous problem in the multivariate case is exceedingly difficult although a few 
attempts have been made to solve it. The impediment is that multivariate stable distributions, unlike the univariate 
case, are defined through their spectral measures which, in practice, are unknown. Ravishanker and Qiou (1999) for 
example, proposed an EM algorithm based on Buckle (1995) in the case of symmetric isotropic stable distributions 
but this class is too narrow to be of empirical importance. It is defined by the transformation X = p + C£, where 
£ is a vector of independent random variables each one distributed as standard symmetric stable, /i is a vector of 
location parameters, E is a scale matrix, and C T C = E. It is known that the class of elliptical stable distributions 
can be defined through the transformation X = p + RCu where u is uniformly distributed on the unit sphere 
S d_1 = {x e R d | ||ce|| = 1}, C is a d x d scale matrix of full rank, and R = ^JVS a /2 where, independently, 
V ~ Xd an( l Sa /2 follows a stable distribution with parameter a/2 and maximal skewness /3 = 1. Of course not all 
multivariate stable distributions are elliptical. See Lombardi and Veredas (2009). When V ~ \i the distribution of 
X is in the class of elliptically contoured stable distributions (Nolan, 2006, p.2). 

In connection with multivariate stable Paretian distributions, even the computation of the characteristic func¬ 
tions becomes complicated because they are only defined through their spectral measure, an object that is needed in 
order to retain the equivalence between the density and the characteristic function. The estimation of the spectral 
measure itself has proved itself to be quite cumbersome even for bivariate distributions (see the seminal works of 
McCulloch, 1994, 2000, Nolan et ah, 2001, and Nolan and Rajput, 1995). 

The present paper is related to recent advances in the econometrics of stable distributions. Dominicy and 
Veredas (2012) propose a method of quantiles to fit symmetric stable distributions. Since the quantiles are not 
available in closed form they are obtained using simulation resulting in the method of simulated quantiles or MSQ. 
Hallin, Swan, Verdebout and Veredas (2012) propose an easy-to-implement R-estimation procedure which remains 
-consistent contrary to least squares with stable disturbances. Broda, Haas, Krause, Paolella and Steude (2012) 
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propose a new stable mixture GARCH model that encompasses several alternatives and can be extended easily to the 
multivariate asset returns case using independent components analysis. Ogata (2012) uses a discrete approximation 
to the spectral measure of multivariate stable distributions and proposes estimating the parameters by equating the 
theoretical and empirical characteristic function in a generalized empirical likelihood / GMM framework. 

Relative to this work, we show how to implement Bayesian inference for multivariate stable distributions by 
providing statistical inferences about the spectral measure jointly with the other parameters of the model. For 
numerical analysis via MCMC we employ a novel data augmentation technique for stable distributions. We use a 
discrete approximation of the measure where the configuration and the number of points are unknown. We also 
propose a novel approximation to the spectral measure based on a multivariate normal distribution. 

2 Stable distributions 

A random variable X is called strictly (univariate) stable if for all n, X^"=i ~ c nX for some constant c ni where 

Xi, ...,X n are independently distributed with the same distribution as X. It is known that the only possible choice 
is to have c n = n 1 /“ for some a £ (0,2). General non-symmetric stable distributions are defined via the log 
characteristic function which is given by the following expression (Samorodnitsky and Taqqu, 1994, and Zolotarev, 
1986): 


logt/?(r) = logAexp(trX) = 

tfiT — |(tt|“{1 — t/3sgn(r) tan ^}, a ^ 1 (1) 

l\xt - cr|r|{l + (./3sgn(r)f log |r|}, a = 1, 

where r R, /i and a are location and scale parameters, a is the characteristic exponent, j3 £ [—1,1] is the 
skewness parameter, and l = \/—l. In this paper we are interested in multivariate stable distributions, that is 
distributions of a random variable in R d . Suppose A is a vector of random variables with characteristic exponent 
a £ (0, 2]. Its characteristic function is ^x{t) = -Eexpjt < t, X >) = exp (—Ix(r) + i < r, fi >) where < r, X >= 
t t X denotes inner product, and 


Ix(t)=[ Ipa (< T,s >)r(ds), 

Js d ~ 1 


( 2 ) 
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where S d_1 = {it £ R d | < it, it >= 1} is the boundary of the unit ball in R d , T is a finite Borel measure of the 
vector X, called the spectral measure, /r £ R. d is a vector of location parameters, and the complex function ip is 
defined as follows: 

i |u|“{l — tsgn(zi) tan ^}, a ^ 1 , 

( 3 ) 

Mi 1 + (-f s g n (' u ) l°g M}> a = 1 ■ 

See seminal work by Nolan (1998), Nolan and Rajput (1995), Abdul-Hamid and Nolan (1998), and also Cambanis 
and Miller (1981), and Nagaev (2000). Notably the parameters (a,F) fully define all centered multivariate stable 
distributions, and a skewness parameter /3 is not needecjj] in this case, since we have the full measure, F. We denote 
the class by A' ~ ^a,d(/U,T). Press (1972) attempted to define a multivariate a-stable distribution without using 
the spectral measure T. Later on Paulauskas (1976) provided some corrections as not all a-stable distributions can 
be represented using Press’ (1972) characteristic function. Chen and Rachev (1995) is an interesting paper where 
the authors provided estimates of the spectral measure as well as applications to a stable portfolio. It is notable 
that the projection of X on r, viz. < t,X > has a univariate stable distribution. The characteristic exponent a 
remains the same but scale, location and skewness depend on T.The multivariate characteristic function is not easy 
to work with as in the univariate case because of the dependence on the spectral measure. As this can rarely be 
specified in advance, it is necessary to provide posterior inferences about it, in the context of Bayesian analysis. 

One approach (Byczkowski et al., 1993) is to assume that T can be approximated by a discrete measure, in 
which case we have: 


J 

T ( ds ) = ^2'7j S { aU)}(ds), (4) 

1=i 

where 7 j > 0, £ S d ,j = 1,..., J, S denotes Dirac’s function and the approximation is made at J points of the 

unit sphere in R d . The meaning of (4) is that we define a finite partition Ai, ..., Aj of § d_1 , points s ^,£ S d ~ x 
and construct P by placing mass r(A,) at so that: 


T (ds) = ^r^)^)}^). 
l=i 

1 Actually, there are skewness parameters /3 (t) that depend on the particular projection t. 
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Suppose 9 denotes the parameters. Since <Px(t ; 9) = exp { — f §d ip a (< t , s >)r(ds)} we obtain: 


or alternatively: 


tpx(r; 9) = exp 


3 =1 



( 5 ) 


where t a 

(.S^, S 


J 

- logv3x(r w ,s J ;0) = ^7?^a(< t,s w >),£ = 1,..., J, (6) 

j=i 

= (t^ 1 ), ...,t( j )) denotes a set of points where the log-characteristic function is evaluated and s J = 
If we define: 


Y(t k ,s j ) = [-logipx{T {1) , s J ;9), ...,-logipx{T {K) , s J ;9)]' = 
[lx s J ] 9), I x {t^ k \ s j ; 9))' 

X(r K ,s J ) = [^ ij (r( i \s^)\ 

^j(r (i) ,s w ) = -0a(< rW,s® >),i = 1, = 1,J, 


we can write ( 6 ) as a system of linear equation^ 


Y (t k ,s j )=X(t k ,s j ) 1 , (7) 

from which, in principle, we can obtain approximations to the spectral weights, 7 j which, in (7), we collect 
in vector 7 . In practice, the system of equations suffers from singularities and the estimates of 7 are not always 
non-negative. The reason that X T X is often singular is that when a full grid is used, we encounter points where 
Ix (— t ) = Ix(t). Nolan, Panorska and McCulloch (2001) propose such symmetric grids around the basic directions 
(1,0), (0,1), (-1,0), and (0,-1) corresponding to independent components in the bivariate case. Generally we would 
need J oc 2 d in the d —dimensional case if we need to explore the measure around coordinates corresponding to 
independent components. This is manageable in dimensions up to 10. Of course the possibility arises that we may 
actually need a value of J that is significantly lower since the spectral measure can be embedded in a much smaller 
subspace. McCulloch (1994, 2000) has proposed the use of quadratic programming imposing the non-negativity 
and Nolan, Panorska and McCulloch (1997) report that, at least in small dimensions, the procedure works well. 

2 The system is in general complex-valued so we need to take the real and imaginary parts of Y and X. 
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The problem is challenging in that a double grid has to be specified, t a for the set of points to evaluate the log- 
characteristic function and s J for the support of the discrete measure. Apparently in all but very low dimensions 
(d = 2 specifically) if we were to use a full grid in (7) we would face the curse of dimensionality as matrix X would be 
huge. Therefore, the only choice is to place the points t k , s J in a “wise” maimer without sacrificing computational 
ease. 


3 A hierarchical model for multivariate stable distributions 

Matsui and Takemura (2009) extended the work of Abdul-Hamid and Nolan (1998) and provided semi-closed 
expressions for the class of multivariate stable distributions. Abdul-Hamid and Nolan (1998) used higher order 
derivatives of one-dimensional densities. The role of the following function is important: 


{ (27t) d / 0 °° cos (yu — {/3tan yy-} u a ) u d 1 exp(— u a )du, a ^ 1, 

( 8 ) 

(27r) -d J 0 °° cos (vu + ^fiulogu) u d ~ 1 exp(—u)du, a = 1. 


By theorem 1.1 in Matsui and Takemura (2009) due to Theorem 1 in Nolan (1998), the density of X ~ ^a,d{C: T), 
where £ £ is a shift parameter, can be expressed as: 




V 1 9a,d P(sj) <7(S)~ d dS , a ^ 1, 

r ( <x-C,S>-M(S)-f/3(S)<T(S)loga(S) \ , 

J§d—i 91 ,d (-- ,P(S) 1 a(S) dS, a = 1. 


( 9 ) 


Here, the parameters 


r(S) = 


(J i \<t,S>\ a T(dt^j ' , 


( 10 ) 


0(S) = a{S)- a f sgn(< t, S >) • | < t, S > | a T(dt), 


( 11 ) 
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and in these expressions t € K. d is the angle, see Samorodnitsky Taqqu (1994, example 2.3.4) and Matsui and 
Takemura (2009). Notably, the location or shift parameter is zero when a / 1 and a certain constant independent 
of £ when a — 1. We can certainly assume £ = 0. The role of £ arises mainly when the distribution is symmetric 
around £, see Corollary 4 in Abdul-Hamid and Nolan (1998). For the definition of location when a = 1 see equation 
(6) in Matsui and Takemura (2009) or equation (9) in Abdul-Hamid and Nolan (1998). The advantage of the 
expressions is that function g a d “is a function of two real variables no matter what the dimension d is, and that 
this function is the same for every a-stable r.v. A; i.e., it is independent of the spectral measure” (Abdul-Hamid 
and Nolan, 1998). 

It is possible to represent the density in terms of functions similar to those used by Zolotarev (1986, equation 
2.2.18, p.74 ) to convert the range of integration to a finite interval (whose upper bound is one) and avoid the 
infinite oscillations caused by the trigonometric terms (Buckle, 1995). It is clear, however, that latent variables s 
can be defined so that when X t ~ iid t = 1, ...,n, then from (8) and (9) we have: 


p (X t \S t ,6) oc g a ,d {v t ;P(S t )),t = l,...,n, 


( 12 ) 


v t = 


<X t ~( , S t > 
cr(St) 


(13) 


S t ~^(S d - 1 ),t = l,...,n, 

and ^(S^ -1 ) denotes the uniform distribution over the unit sphere. We remark that 

\ !/« 

| < S t ,c> \ a T(dc) j , 

p(S t ) = <r(St)~ a [ sgn(< St, c >)| < S t , c > |°T(dc) (16) 

s ^- 1 

The representation in (10)-(12) is a hierarchical model for a general multivariate stable distribution. Of course, 
the hierarchical model involves the spectral measure, r(d5 l ), through the functions f3(S) and o-(S'). These functions 
can be computed either via direct integration over the unit sphere in or by simulation (when S ~ T in their 



(14) 


(15) 
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computation). It is possible to use further augmentation of the model by latent variables which is, however, not 
recommended since it will affect seriously the mixing properties of MCMC. See for example the integrals over a 
bounded interval used by Matsui and Takemura (2009) to obtain the g aj d function. These correspond to similar 
expressions in Zolotarev (1986, pp. 76-77) and Buckle (1995). Although numerical integration is facilitated it does 
not avoid the problem of the presence of the spectral measure, T. Eventually, the density (9) has to be evaluated 
using v t in (14) and as Matsui and Takemura (2009) note the singularities are not completely avoided and further 
work is needed to deal with them as in section 3 of Nolan (1997) or the work reported here. 

Computation of ( 8 ) is a subtle matter. We proceed as follows. Since the cosine function cos(;r) = 0 at x = 
for k = 0, 1, |, 2,... we locate the roots of the equation vu — {f3{s t ) tan ^ }u a = for k = 1, §, 2,... with 

respect to u. Denote these roots by ri,r 2 ,.... The largest value of k is determined by the weight factor in ( 8 ), viz. 
f(u) = tr _1 exp(— u a )du so that the value of the weight factor at a certain u is less than £ = 10 -7 relative to its 
mode. The weight factor can be modified using the transformation w = u a in which case w follows a standard 
gamma distribution with shape parameter —. The mode is to = — l) 1 ^ so in practice we can set u = Cm and 

determine the constant C so that the weight factor is sufficiently small. Therefore the integral in (15) is computed 
in the intervals [0, r*i], [r*i, 7 * 2 ], etc as follows: 


9 a ,d( v ’P( S t)) = 


(17) 


(27r) d J2k= 1 Irk 1 cos ( Vwl ^ a — {/?(<Sf) tan w ) w » 7 exp(— w)dw 

after a change of variables to w = u a 1 where ro = 0 and rj > w = 10 (^ — l). This is always possible since the 
cosine function has an infinite number of roots. We use 20-point Gaussian quadrature to compute the integrals in 
the intervals [rk~i, Tk] determined by the roots of Xk(u) = vu— {/3(s t ) tan u a — ^ = 0. An alternative stopping 
criterion we used is when the roots ry — ry_ 1 < e = 10 ~ 4 so that the contribution to the integral in ( 8 ) is trivial. We 
have found that locating the roots is extremely easy provided we can locate the root of Xi(' u ) = 0 which requires 
a good starting value. Then the root is an excellent starting value to locate rk +1 using a standard Newton 
algorithm with analytical derivatives. The root r\, viz. Xi(ri) = 0 can be located using bisection. Moreover, 
Gaussian quadrature was found to work well. This procedure takes account of the oscillations of g a ,d{v, fi(st)) and 
is quite efficient in computing its values. In practice, too many zero points are needed and, therefore, it is better 
to utilize the finite integral representation in Matsui and Takemura (2009) and Abdul-Hamid and Nolan (1986). 
Our results in artificial and actual data were compared with 30- and 40-point quadrature and we found no essential 



differences. We were not able to find different results when the finite integral representation mentioned above was 
used. However, the form of the integrand suggests an adaptive quadrature scheme in a finite interval, until specified 
tolerance is achieved. This method is comparable in terms of accuracy, but less efficient compared to Matsui and 
Takemura (2009) and Abdul-Hamid and Nolan (1986). 

Here, we propose to approximate the spectral measure by 


C-t ^ ^ I) , fy. 


viz. a multivariate normal over the unit sphere, where p and to are unknown parameters and 1 ^ is the unit 
vector in R d . The augmented posterior of the model conditionally on T can be written in the form: 


p(a,{5' t },C|r, A) oc 


(18) 


n?=i 9 a , d v a ,r(S t )~ d ■ I(S t G S*" 1 ) ■ p(a, 0, 

where p(a,Q denotes the prior and the notation a , a ,r{St), Pa,r(St) makes explicit the dependence on a and 
the measure T. The variables St are treated as latent variables with a uniform prior over S d_1 and, provided the 
measure T is approximated with a normal distribution the only unknowns parameter are p and to for which we 
adopt a prior of the form: p(p,u) oc w -1 . Then, from (13) and (14) we have: 


a a ,r(S t ) a = E c ^ r \ <S t ,c>\ a . 


(19) 


We notice that: 


/?a,r (St) = 


-Llx(S t ) 
a(St) a tan ^ 


a ^ 1, 


L{Ix(.2St)-2Ix(St)} 
-ia(St) Inf 


4 MCMC scheme 


( 20 ) 


The great advantage of (16) is that the measure enters implicitly through the functions a a ,r(St ), Pa,r(S t ) only. 
With the Discrete Approximation of the spectral measure, the computation of functions in (17) and (18) is trivial. 
When the measure T is approximated with a normal distribution the only unknown parameters are p and to for 
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which we adopt a prior of the form: p(/i,o;) oc u: 1 . The expectations can be approximated as follows: 

M 

(JocAStT = M - 1 ^ I < St,c m > |“, (21) 

m=1 

where c m ~ ikL/F(/x • Id, 0J 2 Id), i = 1, The expectation in (19) can be computed easily once we have (4) 

or the solution to (7) and can be obtained from (18). 

For the Discrete Approximation denote the parameters by 6 = (a, £, 7 ). For the Normal Approximation the 
parameter vector is 9 = (a, £, /x, w) and we also have the latent variables {s t , t = 1, n} which are absent from the 
discrete approximation. In the Discrete Approximation, our prior has the form: 

7 |<A ~ £l), 

subject to the constraints that 7 > 0 where vo > 0 is a constant and <j> is defined below. To proceed, we write (7) 
in the following form: 


Y(r K , s J ) = , s J ) r y + U, 


( 22 ) 


wherq)j U ~ Nkj(0,4> 2 ) is an error term that denotes the deviation between the empirical and theoretical log- 
characteristic functions. The bar denotes that the t k are fixed. Combining with (16) we have the posterior 
resulting from the Discrete Approximation: 


p(a, C, {St}, r|X) (X n?=i 9a, d a a , r (S t )~ d ■ p(a, ()■ 

<£ (n+1) exp |-2[(Y - X7)' (Y - X7) + wYj \} ■ 1(7 > 0). 


(23) 


The posterior resulting from the Normal Approximation is the following: 


P(oc, C, {St}, n, U)\X) OC nr=l 9a, d ’ P a ’“( St }) a a,u( S t) d ■ p{a, ()■ 

^- ( " +1 >exp{-^£” = i SjS t }-T[? = 1 I(S t eS^ 1 ). 

3 Matrices Y and X are redefined so that they contain their real and imaginary parts. 


(24) 
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We assume throughout that p(a,Q oc 1(0 < a < 2). It appears that there are at least three novelties in the 
formulation of these posterior distributions. 

1. The formal treatment of (7) in the context of (22) which facilitates considerably the posterior 
estimation of spectral weights, 7. 

2. The introduction of latent variables {S^} to avoid integration over § d_1 in (9). 

3. The normal approximation to the measure T in connection with (9). 

The proper prior on 7 facilitates the regularization of the troublesome matrix X'X through the parameter m, and 
positive values of 7 are guaranteed through the truncation. It is well known that the matrix is ill-conditioned in 
the univariate case when J is moderately large, due partly to the fact that placing optimally the support points is 
a difficult problem. See Koutrouvelis (1980, 1981) and Madan and Seneta (1987) among others. It seems equally 
difficult to find “optimal” placements in the multivariate case. In this paper we examine sensitivity of posterior 
results to alternative configurations of t k . Nolan, Panorska and McCulloch (2001) report that the matrix is well- 
behaved in the two-dimensional case even with fine grids. Our experience is similar and implies that one needs to 
avoid points where the real and imaginary parts of Ix(t) have (numerically) the same value. 

Of course, drawing 7 from the second term in (24) is straightforward. It is required to draw from: 

7 ~ Nj( 7, V )>7 > 0, 

(25) 

7 = (X'X + ml^X'Y, V = </> 2 (X'X + ml)- 1 . 

This proposal is accepted with certain probability given by the first term of (24) to maintain the correct posterior 
distribution. Our MCMC algorithms (whose details are presented in Appendix A) are as follows. 
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MCMC — Discrete Approximation 


i. Draw 7 = 7* using the normal linear model in (24). Denote the normal proposal by g 7 ( 7). 

ii. Compute (19) and (20) and therefore (22). 

iii. Accept the draw with certain probability. 

iv. Update 9 using (22) with given s t = s t using proposal qg{9). Accept the candidate with certain 
probability. 

v. Update s t . 

MCMC — Normal Approximation 

i. Propose Sf ~ <yf£(/x • Id, w 2 /). Denote the proposal by q(s^\9,X). 

ii. Compute (19) and (20). 

iii. Accept the draw with certain probability based on (23). 

iv. Draw 9 * ~ qg{ 9 ). Accept the candidate with certain probability based on (23). 

There are certain numerical issues to resolve. First, the choice of the number of support points, J, is made formally 
using the log-marginal likelihood, for the MCMC-DA. Second, for the parameters 9 we use as proposals Student -1 
densities, centered at the maximum likelihood quantities for a, C, derived from fitting univariate stable symmetric 
distributions. For parameter u> we use as proposal: St ~ \ 2 n . The univariate stable symmetric density is 

computed using McCulloch’s (1998) method. ML estimates of ( and a diagonal covariance matrix times a constant 
h is used in constructing the proposal for £. For a we take the average of ML estimates with variance h times the 
median variance from ML estimates. We adjust h during the transient or “burn-in” phase to get acceptance rates 
between 20% and 30%. 

Third, To construct a proposal g 7 ( 7 ) for the (normalized) spectral measure, we begin with draws from standard 
uniform, let the MCMC algorithm run through its transient phase and run it again for another S Q = 50,000 
iterations. Then we use uniform proposals in intervals of the form [a, b] where a and b are determined from the 
99% probability intervals during the S a phase. The termination of transient phase is determined using Geweke’s 
(1992) diagnostics every 10,000 passes by comparing the first and last 2,500 draws. In artificial samples, depending 
on the parameters, we needed 50,000 to 150,000 draws. The results are not reported to save space but a separate 
appendix is available on request. All reported results are based on the final 100,000 draws by thinning every other 
tenth draw. 
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Fourth, the number of simulations, M, to approximate the functions /3(s) and cr(s) is set to M = 5,000 and 
we check in preliminary numerical work whether this is sufficient. Depending on the values of a and s reasonable 
precision can be achieved with M ranging from 500 to 2, 500. For some computational details see the first paragraph 
of Section Al in the Appendix. 

Fifth , in (22) and (6) or (7) we need to pre-select the configuration |t a } where the log-characteristic function 
is computed. We can use, again, points in § d_1 as in McCulloch (1994). We opt for 

r« ~ JVd( 0, 1), < r (l) , r« >= 1, i = 1,..., K, 

instead of a uniform distribution. The reason is that we need to concentrate such points near the origin (Madan 
and Seneta, 1987, Koutrouvelis, 1980, 1981, Xu and Knight, 2010, and Yu, 2007). We set K = 10J so we have 
ten times as many points to evaluate the log-characteristic functions than the number of spectral weights. Sixth, 
the prior parameter vo = 0.01 which produces a sufficiently diffuse prior for the spectral weights although we also 
examine alternative values for this parameter. 

5 Empirical application 

We consider ten currencies against the US dollar over the period July 3 1996 to May 21 2012 (see Tsionas, 2012). 
The currencies are Canadian dollar, Euro, Japanese yen, British pound, Swiss franc, Australian dollar, Hong-Kong 
dollar, New Zealand dollar, South Korean won and Mexican peso. The data is daily and was converted to log 
differences. The data are filtered using an AR(1)-GARCH(1,1) model. MCMC is implemented using a preliminary, 
transient phase of length 20,000. Then we take another 100,000 draws from the posteriors of Discrete and Normal 
Approximations. The number of support points J for each s t is determined by running different MCMC chains and 
computing the approximate log-marginal likelihood using the method of Lewis and Raftery (1997) and DiCiccio et 
al. (1997). For computational details see paragraph 3 in Section Al of the Appendix. 

Student -1 proposal densities (with 10 degrees of freedom) were tuned to provide acceptance rates between 20% 
and 30% for the latent variables. From the results in Table 1 it turns out that Bayes factors favor J = 20 points for 
several values of m so we choose this value to proceed further with Bayesian analysis with w = 0.01. The matrix 
X T X was not found singular in all but exceptional cases where approximately (numerically) grids were generated. 
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In such cases the parameter w resolves the problem. This computational experience is consistent with the results 
in Nolan, Panorska and McCulloch (2001). In their paper they mention that one needs to scale the data by the 
median of \Xt\. Similar observations were made by Meerschaert and Scheffler (1999) and Tsionas (2012b). Here we 
followed the same approach. 
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Table 1. Bayes factors relative to J = 5 


J —> 

10 

15 

20 

30 

40 

50 

137 = 10 -4 

27.12 

33.87 

71.23 

40.01 

22.1 

12.67 

137 = 10 -3 

29.03 

35.55 

81.59 

32.33 

10.01 

6.50 

vo = 0.01 

32.33 

37.10 

82.03 

20.93 

9.12 

4.33 

137 = 0.1 

31.44 

40.32 

95.44 

14.32 

1.83 

2.16 


In panel (a) of Figure 1 we report marginal posterior densities of a using the Discrete and Normal approximation 
for a fixed configuration t k . Sensitivity of marginal posteriors with respect to ten alternative configurationtjj is 
examined in panel (b). Marginal posteriors for different values of K are given in panel (c) up to I\ = 10 J. Clearly, 
as K increases marginal posteriors of a behave in the same way. 

Finally, in panel (d) we report posterior means of the Discrete Approximation to the spectral measure (thick 
line) and typical posterior means from the Normal Approximation under various configurations of t k . In fact, the 
plot underestimates the Normal’s ability to approximate closely the discrete measure (which we take as the “true” 
measure or its the best approximation) because the discrete measure also changes with the configuration. Overall, 
the results are quite robust reasonable and it is, indeed, encouraging that the Normal Approximation behaves so 
well provided, of course, that the multivariate characteristic function is evaluated at a large enough number of 
points. Moreover, the Bayes factor resolves successfully the problem of selecting the number of support points, J, 
for the latent s t . 


4 Autocorrelation of parameter draws is non-trivial as expected. The maximum autocorrelation ranged from 0.20 to 0.50 at lag 50 
for the latent variables but was significantly lower for the structural parameters 6 ranging between 0.10 and 0.30. So there is enough 
evidence that the chains mix well. 
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Figure 1. Empirical results for multivariate stable distribution, exchange rate data 


a. Marginal posteriors of a 
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b. Marginal posteriors of a, different x K (K=20) 
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d. Normalized spectral measures 



To study the association between multivariate stable random variables, the role of the spectral (Levy) measure 
(T) has been found critical (Mittnik, Rachev and Ruschendorf, 1999). The dependence-at-the tails function can be 
estimated non-parametrically or using the posterior mean of the Levy measure which has been computed. Technical 
details on computing the Mittnik, Rachev and Ruschendorf (1999) function, m(a;), are too many to reproduce here, 
so we refer instead to their paper. Specifically, the dependence function is defined in their (2.7) using (2.6) with the 
spectral measure involved directly in (2.7) and (2.4). As they mention: “If an explicit parametric model is assumed, 
it would be more natural and efficient to use the rank-order process or the dependence function itself” (p. 184) 
which is precisely what we do here. 

Our empirical results are reported in Table 2. We report posterior means of dependence at the tails. The tail 
is defined as the upper 5% percentile. Each dependence measure is the posterior mean from MCMC simulation. 
The tail dependence measures that include zero in the Bayes HPD are not reported. Diagonal elements are not 
reported. 
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Table 2. Empirical results: Tail-Dependence measures 

We report the Mittnik, Rachev and Riischendorf (1999) dependence measure at the tails. The tail is defined as the upper 5% percentile. Each 
dependence measure is the posterior mean from MCMC simulation. Tail dependence measures that include zero in the Bayes HPD are not 
reported. Diagonal elements are not reported. 



CAD 

EUR 

JPY 

GBP 

CHF 

AUD 

HKD 

NZD 

KRW 

MXN 

CAD 


0.625 


0.742 







EUR 



0.645 

0.815 

0.713 

0.613 

0.511 

0.662 

0.772 

0.341 

JPY 




0.740 

0.855 

0.410 



0.815 


GBP 





0.603 


0.722 

0.825 

0.649 

0.328 

CHF 






0.515 



0.501 


AUD 

0.610 







0.717 


0.332 

HKD 









0.423 


NZD 










0.756 

KRW 










0.336 

MXN 












Dependence at the tails is, obviously, quite large with the Euro associated with most currencies followed by the 
GBP and JPY. Non-parametric measures computed as in Mittnik, Rachev and Riischendorf (1999) are somewhat 
different, showing that relying explicitly on multivariate stability delivers some gains in terms of efficiency assuming, 
of course, the model is a better description of reality. To our knowledge this is the first application of the tail- 
dependence measure provided an explicit Levy measure T is used. This measure is computed here as the posterior 
mean from Bayes MCMC simulation. 

Given the empirical results it does not appear possible to remove any currency from the multivariate vector due 
to its weak dependence or no dependence at all to other currencies. This is despite the fact that we have allowed 
for an AR(1)-GARCH(1,1) scheme. Removing GARCH effects which are prevalent in many financial time series is 
essential in order to satisfy, at least approximately, the i.i.d. assumption involved in the analysis of multivariate 
stable distributions. 
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Concluding remarks 


In this paper we break new ground in the treatment of multivariate stable distributions along the following lines. 
First, we propose a normal approximation to their spectral measure that seems to work very well in practice. Second, 
we propose efficient MCMC techniques by introducing appropriate latent variables. These are estimated from the 
data along with the spectral measure. Third, in connection with the important per se discrete approximation of the 
measure, we estimate it in the context of the simple normal linear model based on the log-characteristic function. 
The normal approximation reduces considerably the computational burden without sacrificing, as it seems, the 
quality of the approximation to the benchmark provided by the discretization of the spectral measure. The fact 
that it works well in a data set with ten variables and almost 4,000 observations is quite encouraging in terms of 
applications of multivariate stable distributions. 

Posterior inferences seem to be quite robust with respect to the configurations of t k . Provided these are normally 
distributed over the unit sphere in R d , our MCMC schemes mix well with respect to the structural parameters and 
latent s J . 

References 

Abdul-Hamid, H., and J.P. Nolan, 1998, Multivariate stable densities as functions of one-dimensional projections, 
Journal of Multivariate Analysis 67, 80-89. 

Bazant, Z.P., and B.H. Oh, 1986, Efficient numerical integration on the surface of a sphere, Zeitschrift fur 
Angewandte Mathematik und Mechanik 1, 37-49. 

Belisle, C., H. Romeijn, and R. Smith, R., 1993, Hit and run algorithms for generating multivariate distributions, 
Mathematics of Operations Research 18, 255-266. 

Broda, S.A., Haas, M., Krause, J., Paolella, M.S., and S.C. Steude, 2012, Stable mixture GARCH models, 
Journal of Econometrics, in press. 

Buckle, D.J., 1995, Bayesian inference for stable distributions, Journal of the American Statistical Association 
90, 605-613. 

Byczkowski, T., J.P. Nolan, and B. Rajput, 1993, Approximation of multidimensional stable distributions, 
Journal of Multivariate Analysis 46, 13-31. 

Cambanis, S., and G. Miller, 1981, Linear problems in pth order and stable processes, SIAM Journal of Applied 


18 



Mathematics 41, 43-69. 


Cheng, B.N., and S.T. Rachev, 1995, Multivariate stable futures prices, Mathematical Finance 5, 133-153. 

Chib,S. and E. Greenberg, 1995, Understanding the Metropolis-Hastings algorithm, The American Statistician 
49, 327-335. 

DiCiccio, T. J., R.E. Kass, A. Raftery, Adrian, L. Wasserman, 1997, Computing Bayes Factors by Combining 
Simulation and Asymptotic Approximations 92, 903-915. 

Dominicy, Y., and D. Veredas, 2012, The method of simulated quantiles, Journal of Econometrics, in press. 

Garcia, R., E. Renault, and D. Veredas, 2011, Estimation of stable distributions by indirect inference, Journal 
of Econometrics 161, 325-337. 

Geweke, J.. 1992, Evaluating the accuracy of sampling based approaches to the calculation of posterior moments, 
in J.O. Berger, et al. (eds.), Bayesian Statistics, Vol. 4, pp. 169-194. Oxford: Oxford University Press. 

Hallin, M., Swan, Y., Verdebout, T., and D. Veredas, 2012, One-step R-estimation in linear models with stable 
errors, Journal of Econometrics, in press. 

Koutrouvelis, I. A., 1980, Regression-type estimation of the parameters of stable laws, Journal of the American 
Statistical Association 75, 918-928. 

Koutrouvelis, I. A., 1981, An iterative procedure for estimation of the parameters of stable laws, Communications 
in Statististcs - Simulation and Computation 10, 17-28. 

Lewis, S.M. and A.E. Raftery, 1997, Estimating Bayes’ factors via posterior simulation with the Laplace- 
Metropolis estimator, Journal of the American Statistical Association, 92, 648-655. 

Lombardi, M. J., and D. Veredas, 2007, Indirect estimation of elliptical stable distributions, Computational 
Statistics and Data Analysis 53, 2309-2324. 

Madan, D.B., and E. Seneta, 1987, Simulation of estimates using the empirical characteristic function, Interna¬ 
tional Statistical Review 55, 153-161. 

Matsui, M., and A. Takemura, 2009, Integral representations of one-dimensional projections for multivariate 
stable densities, Journal of Multivariate Analysis 100, 334-344. 

McCulloch, J. H. , 1994, Estimation of bivariate stable spectral densities, Technical Report, Department of 
Economics, Ohio State University. 

McCulloch, J. H., 1998, Numerical Approximation of the Symmetric Stable Distribution and Density, in R. 


19 



Adler, R. Feldman, and M. Taqqu (eds), A practical guide to heavy tails: Statistical techniques for analyzing heavy 
tailed data, Boston, Birkhauser, 489-500. 

McCulloch, J. H. , 2000, Estimation of the bivariate stable spectral representation by the projection method, 
Computational Economics 16, 47-62. 

Meerschaert, M.M., and H.-P. Scheffler, 1999, Moment estimator for random vectors with heavy tails, Journal 
of Multivariate Analysis 71, 145-159. 

Mittnik, S., S.T. Rachev and L. Riischendorf, 1999, Test of association between multivariate stable vectors, 
Mathematical and Computer Modelling 29 (10-12), 181-195. 

Nagaev, A., 2000, On non-parametric estimation of the Poisson spectral measure of a stable law, Journal of 
Mathematical Sciences 106, 2854-2859. 

Nolan, J., 1997, Numerical calculation of stable densities and distribution functions, Communications in Statis¬ 
tics - Stochastic Models 13, 759-774. 

Nolan, J. P., 1998, Multivariate stable distributions: approximation, estimation, simulation and identification. 
In R. J. Adler, R. E. Feldman, and M. S. Taqqu (Eds.), A Practical Guide to Heavy Tails: Statistical techniques 
for analyzing heavy tailed data, pp. 509-526. Boston: Birkhauser. 

Nolan, J. P., A. Panorska, and J. H. McCulloch, 2001, Estimation of stable spectral measures, Mathematical 
and Computer Modelling 34, 1113-1122. 

Nolan, J. P. and B. Rajput, 1995, Calculation of multidimensional stable densities, Communications in Statistics 
- Simulation 24, 551-556. 

Ogata, H., 2012, Estimation for multivariate stable distributions with generalized empirical likelihood, Journal 
of Econometrics, in press. 

Paulauskas, V. I., 1976, Some remarks on multivariate stable distributions. Journal of Multivariate Analysis 6, 
356-368. 

Pourahmadi, M., 1987, Some properties of empirical characteristic functions viewed as harmonizable processes, 
Journal of Statistical Planning and Inference 17, 345-359. 

Press, S. J., 1972, Estimation in univariate and multivariate stable distributions, Journal of the American 
Statistical Association 67, 842-846. 

Ravishanker, N., and Z. Qiou, 1999, Monte Carlo EM estimation for multivariate stable distributions, Statistics 


20 



& Probability Letters 45, 335-340. 

Roose, D., and E. De Doncker, 1981, Automatic integration over a sphere, Journal of Computational and Applied 
Mathematics 7, 203-224. 

Samorodnitsky, G., and M. S. Taqqu, 1994, Stable non-Gaussian random processes, Chapman and Hall, New 
York. 

Stroud, A., 1971, Approximate Calculation of Multiple Integrals, Prentice Hall. 

Tierney, L., 1994, Markov chains for exploring posterior distributions, Annals of Statistics 22 (4), 1701-28. 

Tsionas, E.G., 1999, Monte Carlo inference in econometric models with symmetric stable disturbances, Journal 
of Econometrics 88, 365-401. 

Tsionas, E.G., 2012, Estimating tail indices and principal directions easily, Statistics & Probability Letters 82, 
1986-89. 

Zolotarev, V.M., 1986, One-dinrensional stable distributions, AMS Translations of Mathematical Monographs, 
vol. 65, AMS, Providence, Long Island. 

Xu, D., and J. Knight, 2010, Continuous empirical characteristic function estimation of mixtures of normal 
parameters, Econometric Reviews 30, 25-50. 

Yu, J., 2007, Empirical characteristic function and its applications, Econometric Reviews 23, 93-123. 


Appendix A. Computational details 

Al. Computation of skewness coefficients a(s) 

1 . 

See Bazant and Oh (1986) and Roose and De Doncker (1981). See also Stroud (1971) for the multidimensional case 
since the emphasis in Roose and de Doncker (1981) and Bazant and Oh (1986) is on S 2 . Related Fortran code in 
Stroud (1971) is SPHERE_05_ND. In § 2 specialized techniques were compared against a hit-and-run algorithm to 
generate points uniformly distributed over the spheres. In higher dimensions we compared using Stroud’s (1971) 
rules. The computational cost of hit-and run algorithm (Belisle et al, 1993) is trivial even in high dimensional 
spaces. In dimensions up to 20 we have found that M in the range 500-2,500 provides acceptable results. 
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2 . 


Next, we provide computational experience with the integral <r(s) a = Ld-i \ < s,t > \ a T(dt) when the spectral 
measure corresponds to yl^(0,I). The question is, therefore, how to approximate this multivariate integral accu¬ 
rately and efficiently. The particular measure is of importance since we use it to approximate the unknown measure. 
The integral is approximated as <r(s) a ~ Af _1 J2m= 1 1 < s,t m > \ a where t 1; ...,t M ~ ikLd))(0,/) . To assess the 
approximation we select 1,000 points s uniformly distributed over the unit sphere S d_1 and we report the median 
absolute errors for different values of d, a and M. The “exact” value of er(s) a is obtained using M = 10 6 draws. 
The “exact” value is biased as the support of T is § d_1 but it is simulation-consistent. Draws uniformly distributed 
in the unit sphere are obtained efficiently using the hit-and-run algorithm of Belisle et al (1993). The algorithm is 
implemented by running 11,000 iterations and keeping the last 1,000. Convergence was (successfully) tested using 
Geweke’s (1992) diagnostic. The results are reported in Table Al. 
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Table Al. Median absolute errors of approximation. 



The results are encouraging since for large d even 10 or 100 simulations are sufficient to get the integral with 
sufficient accuracy. For d = 10 we need M = 500 and 5,000 seems to be quite sufficient. Since <r(s) and /3(s) are 
only needed in the context of MCMC for acceptance Metropolis-Hustings probabilities the accuracy reported here 
is sufficient. 
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3. 


In (23) the troublesome part in computing the log marginal likelihood is integration with respect to s t . Since the 
posterior is evaluated only at the mean, it is relatively easy to integrate term-by-term using (9) and integration over 
the sphere (Bazant and Oh, 1986 and mainly Stroud, 1971). To implement the method of DiCiccio et al. (1997) 
we use “volume correction” by truncating the MCMC draws by 5% in the tails when approximating the posterior 
at the mean using a multivariate normal kernel. The dimensionality of the parameter space increases mainly due 
to ( € R d . We did not obtain results significantly different from those reported in Table 1 when we fixed these 
parameters to the median of log-differenced exchange rates, which are close to zero. 

A2. Updating support points of the spectral measure 

It is well known that the selection of points is a difficult problem even in the univariate case. See Madan and Seneta 
(1987), Koutrouvelis (1980, 1981), Pourahmadi (1987), Xu and Knight (2010) and Yu (2007). Generally, a full grid 
is not optimal.For the Discrete Approximation the support points are s J = [s^,..., where £ § d_1 , for 

j = 1,..., J. As a proposal we use ^ MO(0, hi)\ < s^\ >= 1 where h > 0 is a tuning constant. The choice 

h = 0.5 performed slightly better than h = 1. To improve the mixing of MCMC, given that this proposal performed 
rather well we decided to replace the Metropolis-Hastings step with the so called accept-reject Metropolis-Hastings 
algorithm (Tierney, 1994, section 2.3.4, Chib and Greenberg, 1995, section 6.1). If the algorithm is currently at 
s J Q and a candidate or proposed move is to si, define D = (a|p(s|6 ) ,Y) < cq(s) as the set where the posterior 
conditional is dominated by the proposal cq(s). Then the acceptance probability: 


a(s„,s*) = { 


1, si £ D, 


Cg(Sp) 

p(s^|0,A')’ 


D,si eD 


p{sj\8,X)q(s£) J 
p( S J\e,X)q( S J) > *o 


D,si 


A 


where q(s J ) denotes the normal proposal distribution and p(s J \9,X) is the conditional posterior of s J from 
the Discrete approximation. To improve mixing, the support points are drawn as a group and by construction the 
move from s J D to si is certain. In the exchange rate data set the average number of rejections required to obtain a 
candidate si was 20. The accept-reject Metropolis-Hastings algorithm performed slightly better in artificial data 
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with the same dimension d = 10 and the average number of rejections was about 15. About 50% of the time we 
use an acceptance test since we have dominance. In artificial data this was between 60 and 80% of the time. 

For the latent variables St in the Normal Approximation, the “natural” proposal is a uniform distribution over 
S d_1 . Other proposals are hard to come by since g at d involves computation of a (univariate) integral and the 
computation of cr au (S t ). To facilitate better mixing we adopt the following strategy. Given the posterior from [I] 
we use M = 100 to obtain an approximate value of <r a w (S t ). Given the current draw S° we use one iteration of 
the Gauss-Newton method to obtain: 


St = S° t - [V 2 logp(S'°|-)] 1 • Vlogp(5 t °|.), 

with numerical derivatives for the gradient and the Hessian. We propose a move to a candidate 


S* t ~JV d (s u -h- [V 2 logp(S?|-)] '), 

which is normalized to lie in S d_1 . The candidate is accepted with probability 


min 


P (s* t \e,8 J ,x)g{s°) 

'p(S°\e,sJ,X)q(S* t ) 


where p(St\6, s J ,X) is the conditional posterior of St from [Hand q(St) denotes the kernel of the normal proposal 
density. The normalizing constant of the proposal is unknown but it cancels out in the acceptance probability. We 
use again the accept-reject Metropolis - Hastings algorithm to guarantee a move from S° to some new S’*. The 
average number of rejections was 120 with h = 1. Using a normal proposal centered directly at the current S° was 
proved impractical since the average number of rejections was in excess of roughly 5,000. Occasionally, the proposal 
centered at St required more than 10,000 rejections. In such cases we abandon the accept-reject step, propose a move 
to S* ~ ^S°,— [V 2 logp(S°|-)] ^ and accept with the usual random walk Metropolis - Hastings probability: 

min |l, p[go|e^j’x) }• the exchange rate data set this occurred in roughly 5% of draws. In artificial data the 
proportion ranged from 0.1% to 1% depending mainly on the value of the characteristic exponent, a. 

Regarding updates of 7 from [4] the proposal from [4] was found to work very well. The acceptance probability is: 


\ , p(r\a,C,{St},X)q(r)r 
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where 


p(7KC,{St},^) °c g a ,d — f' St > ,Pa,r(St) \ cr a ,r(St) d , 

and < 7 ( 7 ) denotes again the normal proposal for 7 as in 01 

The spectral weights are updated as a group and application of the accept - reject Metropolis - Hastings 
algorithm required no more than 1,000 rejections maximum with an average close to 250 and rejection sampling 
close to 30% of the time. The downside is that n univariate integrals have to be computed to obtain g a ,d{') and 
an additional simulation is required to obtain <r(S). We have found that this is, indeed, computationally intensive 
but the advantage is significant: The MCMC scheme mixes very well contrary to an additional augmentation by 
latent variables as in Buckle (1995) or Tsionas (1999). Our preliminary results with data augmentation in d = 2 
were somewhat disappointing in the sense that (i) the accept - reject Metropolis - Hastings takes too long to update 
successfully, and (ii) the simple Metropolis - Hastings algorithm takes to stay far too often. We have of course 
attempted to tune the algorithms by a suitable constant in the covariance matrix in |4] but it did not seem possible 
to provide good acceptance rates. We did not attempt to examine performance in other dimensions although it is 
quite possible that other proposals for 7 might work better. One such proposal, that we have to leave for future 
work, is an importance density drafted along the one - step Gauss - Newton approach that we described earlier in 
connection to updating St- 

Proposal distributions for a can be devised in many ways. Here, we proceed as follows. Suppose t^) ~ ^ (S d_1 ) 
and consider the projections yt,(b) =< T^,X t >, for t = 1,..., n and 6=1,..., B. For each series {yt,(b),t = 1,..., n} 
we estimate the parameter a^ b ) and the location p,( b ) of univariate general stable distributions using maximum 
likelihood via the FFT. Skewness and scale parameters are functions of a through [3] and [3] but they depend on T 
so parameters /3( b ) and a^) are estimated as well. Suppose a = B^ 1 ^2b=i cfe(&) an d V a = B~ : 1 {a — a) 2 . Our 

proposal for a in connection with [4] or [4] is a normal with the indicated moments. We have used B = lOd = 100 
with the exchange rate data. For the shift parameters ( £ l' 1 : (i) During the “burn in” we use a uniform proposal 
in the interval (—a, a) d and a is adapted to obtain reasonable acceptance rates (between 20 and 30%). (ii) During 
the main phase we keep a fixed. 
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A3. An alternative proposal for the spectral measure 


Since the spectral measure, T, is of critical importance in multivariate stable distributions it is, perhaps, desirable 
to examine alternative proposal or importance distributions. From [5] given estimates of a and <r(s) and a discrete 
approximation for T, we have 


°(t) & = ■ I < S(j),t > r» 

3 = 1 

where s J = {s^),j = 1 ,...,/} denotes the points of the support corresponding to partition of the unit sphere. 
Their number, I, is not related to J. From this expression we obtain: 

i 

) 'y ; 7i ■ I ^ > s (j) 7 |", * = 1) •••) N, j = 1 

j=i 

where t N = {t(i), ...,t(jv)} denotes the set set of projections where the scale parameter is estimated. The idea 
here is to fix a set of . Then we know that < t ^, X t > follows a univariate stable distribution with characteristic 
exponent a and skewness and scale given as in [3] and [3] These can be estimated using maximum likelihood and the 
FFT to obtain the log - likelihood function. The system has the form: 

Aj = 6 , 

-A(NxI) = [ a ij]i a ij = I “7 t(i)y s (j) ^ | , 

b (Nxi) = [bi],bi = a (%))“,* = 1 ,-,N,j = 1,...,/, 

and it is somewhat easier to solve than the corresponding system in [ 2 ] which uses the characteristic function. 
More importantly, this system will be solved only once to construct a proposal for updating 7 in the context 
of MCMC. Specifically we choose / = 100 points S(j) ~ iidM / d(0, 1) || § d_1 and TV = 10 • I = 1,000 points 
S(j) ~ (S 6 ^ 1 ) and we use Bayesian inference with non-negativity constraints on the 7 , in the system b = Aj + u 

where u ~ ikLT/v(0, We use an adaptive Metropolis - Hastings algorithm (with 7 = 8 0 6 and S £ K 7 ) to 

perform the computation with 15,000 passes the first 5,000 of which are discarded. From the remaining draws we 
retain only every other tenth and we estimate the mean, 7 and the covariance matrix V from the MCMC draws. 

Since the log marginal likelihood of this model can be computed easily using the Laplace approximation, one 
can set, if desired, J = I* and I* attains the maximum value of the log marginal likelihood. Here, J is the 
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number of support points for the spectral measure that we use in the main text. It was made clear that we did 
not opt for this choice as it does not fully utilize the information in the data. However, it is useful in that it 
can provide a proposal distribution for the support points S( :/ j of the spectral measure. For the support points, in 
particular, this can be achieved as follows. The MCMC procedure in the linear system is repeated 100 times with 
100 different sets of s 1 = {s(j\,j = 1,for a fixed set of t N . The set that corresponds to the best value 
of the log marginal likelihood is {s*, 7 *}. This is repeated for 10,000 alternative sets of t N resulting in a set 
Sf/v = {s* (j),7*,(»),* = 1, ...,1V}. From this set whose variability is due to the different configurations of t N we can 
determine a joint proposal distribution for T = {s(j), 7 j, j = 1,...,/} which characterizes completely the spectral 
measure, T. The joint proposal distribution is a uniform for each element of T whose bounds are determined from 
the 99% probability intervals of Sf/v point-wise. 

This well-crafted proposal can be used to update jointly the spectral weights 7 and the spectral support points 
{s(j), 7 j,j = 1,...,/}. The number of support points, /, was set intentionally to a large number / = 100 although 
preliminary investigations revealed that the optimal I* defined earlier was much lower. Our MCMC scheme for 
inference in multivariate stable distributions is performed for fixed J. For each specific value of J for which we 
implement MCMC in the main text we have tried two alternatives. 

i. We set J = I and repeat the previous analysis so that we have an “exact” proposal for T that 
matches exactly with the spectral weights and support points that we need to update. 

ii. We use I = 100 throughout. For each specific value of J < J, we re-normalize ( $n and F discarding 
weights (and the corresponding support points) less than e/ which is determined by the requirement 
that we have I support points. Given the support points the weights that are less than ej are allocated 
to the retained support points that are closest (in the L\— norm) to the excluded support points. 

With the exchange rate data set the two alternatives did not behave in a qualitatively different way. The posterior 
estimates of structural parameters and more importantly of the spectral measure are quite close. This observation 
implies that MCMC explores the posterior quite satisfactorily once we have a “reasonable” procedure. 

Moreover, we have used a fixed set of projections which does not correspond to the t N used here, since 

we wanted to construct a procedure that can be applied more generally. We use, again, Tierney’s (1994) accept - 
reject Metropolis - Hastings. This, again, provides sufficient evidence that our MCMC explores the posterior quite 
well. 
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In Table A1 we report some statistics relating to the two proposals we have constructed. For the main parameters 
of the model we report the number of rejections in the accept-reject step to obtain a candidate when dominance holds 
and (in parentheses) the percentage of draws for which the accept-reject step was implemented (due to dominance) 
from the “blanketing” proposal. 
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Table A2. Statistics relating to the performance of proposals 



Proposal, section A2 

Proposal, section A3 

acf, lag 50* 

( 7 . {«{,)}) 

67 (43.5%)I 

12 (62.1%) 

0.21 0.15 

7 

25 (50.8%) 

32 (44.4%)! 

0.32 0.37 

I s U)} 

32 (43.2%) 

91 (12.3%)! 

0.18 0.22 

(«>C) 

15 (25.5%) 

7 (62.3%) 

0.25 0.35 


210 (21.3%) 

12 (33.7%) 

0.11 0.07 


Notes: ' In this case we update jointly ( 7 , {^(j)})- 
! In this case we update separately 7 and { s (y)} - 
II The reported numbers are medians for t = 1, n. 

* This is autocorrelation at lag 50. The first number refers to the proposal of section A2 and the 
second number refers to the proposal of section A3. 


Clearly the proposal of this section performs slightly better and “blankets” the posterior conditional distributions 
of the key parameters better. However, the important point is that the proposal of section A2 is quite competitive 
relative to this well-crafted proposal. It seems, therefore, that in practical applications the overhead of constructing 
proposal distributions by preliminary fitting of the projection - dependent scale parameter er(s) will, most likely, 
not be worth the effort. The results reported here refer to a single data set (the exchange rates) so we cannot 
generalize to all empirical applications although this would constitute an important matter for future research. 


30 





























